5 research outputs found

    Check Mate: Prioritizing User Generated Multi-Media Content for Fact-Checking

    Full text link
    The volume of content and misinformation on social media is rapidly increasing. There is a need for systems that can support fact-checkers by prioritizing content that needs to be fact-checked. Prior research on prioritizing content for fact-checking has focused on news media articles, predominantly in English. Increasingly, misinformation is found in user-generated content. In this paper we present a novel dataset that can be used to prioritize check-worthy posts from multi-media content in Hindi. It is unique in its 1) focus on user-generated content, 2) language and 3) accommodation of multi-modality in social media posts. In addition, we provide metadata for each post, such as the number of shares and likes on ShareChat, a popular Indian social media platform, which allows for correlative analysis of virality and misinformation. The data is accessible on Zenodo (https://zenodo.org/record/4032629) under the Creative Commons Attribution License (CC BY 4.0). Comment: 8 pages, 13 figures, 2 tables
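
    As an illustration of the correlative analysis this metadata enables, the sketch below computes a rank correlation between engagement metrics and a check-worthiness label. It assumes a hypothetical CSV export with columns named shares, likes and check_worthy; the actual file layout and field names in the Zenodo release may differ.

```python
import pandas as pd

# Minimal sketch of a virality-vs-check-worthiness analysis on the Check Mate data.
# Assumes a hypothetical CSV export with columns 'shares', 'likes' and a binary
# 'check_worthy' label; the actual schema of the Zenodo release may differ.
posts = pd.read_csv("checkmate_posts.csv")

# Spearman rank correlation is used because share/like counts are heavy-tailed.
for metric in ["shares", "likes"]:
    rho = posts[metric].corr(posts["check_worthy"].astype(int), method="spearman")
    print(f"Spearman correlation of {metric} with check-worthiness: {rho:.3f}")
```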

    The Uli Dataset: An Exercise in Experience Led Annotation of oGBV

    Full text link
    Online gender-based violence has grown concomitantly with the adoption of the internet and social media. Its effects are worse in the Global Majority, where many users use social media in languages other than English. The scale and volume of conversations on the internet have necessitated automated detection of hate speech, and more specifically of gendered abuse. There is, however, a lack of language-specific and contextual data to build such automated tools. In this paper we present a dataset on gendered abuse in three languages: Hindi, Tamil and Indian English. The dataset comprises tweets annotated along three questions pertaining to the experience of gendered abuse, by experts who identify as women or as members of the LGBTQIA community in South Asia. Through this dataset we demonstrate a participatory approach to creating datasets that drive AI systems.
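
    As a sketch of how downstream systems might consume such annotations, the snippet below aggregates multiple expert labels per tweet into a majority-vote label for each of the three questions. The long-format layout and the column names tweet_id, annotator_id, q1, q2 and q3 are assumptions for illustration, not the dataset's documented schema.

```python
import pandas as pd

# Minimal sketch: aggregate per-tweet expert annotations by majority vote.
# Assumes a long-format CSV with hypothetical columns 'tweet_id', 'annotator_id'
# and binary answers 'q1', 'q2', 'q3' for the three gendered-abuse questions;
# the real Uli dataset schema may differ.
annotations = pd.read_csv("uli_annotations.csv")

# A question is labelled positive for a tweet if more than half of its
# annotators marked it positive.
majority_labels = (
    annotations.groupby("tweet_id")[["q1", "q2", "q3"]]
    .mean()
    .gt(0.5)
    .astype(int)
)
print(majority_labels.head())
```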

    To think of interdisciplinarity as intercurrence: Or, working as an interdisciplinary team to develop an ML tool to tackle online gender-based violence and hate speech

    No full text
    The paper reflects on the working of an interdisciplinary team of researchers and activists from the fields of computer science and the social sciences involved in developing a user-facing browser plug-in to detect and moderate instances of online gender-based violence, hate speech and harassment in Hindi, Indian English, and Tamil. There have been multiple calls within the field of Human-Computer Interaction (HCI) to include qualitative methods in one's research design. These calls, while attuned to the importance of qualitative methods for HCI, ignore the intercurrent nature of different research methods, disciplines and practices. The paper borrows the concept of intercurrence from Orren & Skowronek (1996) and reorients it to explicate the practice of interdisciplinary research. It argues that intercurrence (an occurrence within an occurrence) is a useful image for perceiving interdisciplinarity: at any given point, an interdisciplinary team navigates multiple, simultaneously occurring temporal dimensions of differently disciplined bodies. An awareness of these multiple temporalities adds another dimension to thinking about the conflicts and possibilities emerging from interdisciplinary practices and reorients interdisciplinary research towards unexpected outcomes.

    AOIR ETHICS PANEL 2: PLATFORM CHALLENGES

    Get PDF
    This panel is one of two sessions organized by the AoIR Ethics Working Committee. It collects five papers exploring a broad (but in many ways common) set of ethical dilemmas faced by researchers engaged with specific platforms such as Reddit, Amazon's Mechanical Turk, and private messaging platforms. These include: a study of people's online conversations about health matters on Reddit in support of a proposed situated ethics framework for researchers working with publicly available data; an exploration of sourcing practices among Reddit researchers to determine whether their sources could be unmasked and located in Reddit archives; a broader systematic review of over 700 research studies that used Reddit data to assess the kinds of analysis and methods researchers are engaging in, as well as any ethical considerations that emerge when researching Reddit; a critical examination of the use of Amazon's Mechanical Turk for academic research; and an investigation into current practices and ethical dilemmas faced when researching closed messaging applications and their users. Taken together, these papers illuminate emerging ethical dilemmas facing researchers when investigating novel platforms and user communities: challenges often not fully addressed, if even contemplated, in existing ethical guidelines. These papers are among those under consideration for publication in a special issue of the Journal of Information, Communication and Ethics in Society associated with the AoIR Ethics Working Committee and AoIR2021.